ABSTRACT

It is shown that any form of Search Rearrangement Backtracking (SRB) requires
exponential time to verify the unsatisfiability of nearly all of a wide class of CNF boolean
expressions. This result is based on an input model which generates n independent k-literal
clauses from a set of r boolean variables. We assume that k is fixed and n and r tend to
infinity. The result holds if $\lim_{n\to\infty} n/r(n) = \lambda$ is fixed and $\lambda > \ln(2)/(-\ln(1 - 2^{-k}))$.
We also show that SRB requires superpolynomial time nearly always if $\lambda$ is replaced by
$\lambda(n) = o(n^{1/\ln\ln(n)})$ and $\lim_{n\to\infty} \lambda(n) = \infty$ (so the superpolynomial time result holds, for
example, if $\lambda(n) = (\ln(n))^{\beta}$ where $\beta$ is any positive constant). We also show that these
results apply to any form of the Davis-Putnam Procedure.
1.  Introduction


The  Satisfiability  problem  (SAT)  is  the  problem  of  determining  whether  there  exists  an 
assignment  of  values  to  boolean  variables  (a  truth  assignment)  which  causes  a  given  boolean 
expression  I  to  have  value  true.  A  truth  assignment  which  causes  I  to  have  value  true  is 
called  a  satisfying  truth  assignment  and  is  said  to  satisfy  I.  It  is  well  known  that  SAT 
is  NP-hard.  However,  it  has  been  shown  that  some  algorithms  solve  SAT  efficiently  in 
a  probabilistic  sense  under  certain  conditions.  These  conditions  are  determined  by  the 
parameters  of  the  input  models  chosen  for  analysis. 

The model we consider in this paper, denoted M(n,r,k), is the constant clause size
model for CNF boolean formulas. An instance of SAT generated according to M(n,r,k)
is the conjunction of n k-literal clauses (disjunctions), each selected uniformly and with
replacement from the set of all k-literal clauses that can be composed from r boolean
variables with the property that no two literals in a clause are associated with the same
variable. We will assume that k is fixed (independent of n and r), and we use k-SAT in
place of SAT when referring to instances generated by M. The problem of finding a
satisfying truth assignment for an instance of k-SAT or verifying that no such truth assignment
exists is NP-hard if $k \ge 3$ ([12]). The model M(n,r,k) has the interesting property [10]
that if $n/r < \ln(2)/(-\ln(1 - 2^{-k}))$ then the average number of truth assignments that
satisfy random instances of k-SAT is exponential in r and if $n/r > \ln(2)/(-\ln(1 - 2^{-k}))$
then almost all random instances have no satisfying truth assignments (that is, they are
unsatisfiable).
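
The threshold above can be evaluated numerically. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, and the expected-count formula $2^r(1 - 2^{-k})^n$ is the standard first-moment computation, which holds because a fixed truth assignment falsifies a uniformly chosen k-literal clause with probability $2^{-k}$.

```python
import math


def unsat_threshold(k: int) -> float:
    """Ratio n/r = ln(2) / (-ln(1 - 2^-k)) separating the two regimes."""
    return math.log(2) / (-math.log(1 - 2.0 ** (-k)))


def log2_expected_models(n: int, r: int, k: int) -> float:
    """log2 of the expected number of satisfying assignments under M(n, r, k).

    Each of the 2^r assignments satisfies one random clause with
    probability 1 - 2^-k, and the n clauses are independent.
    """
    return r + n * math.log2(1 - 2.0 ** (-k))


if __name__ == "__main__":
    for k in (3, 4, 5):
        print(k, round(unsat_threshold(k), 3))
```

Below the threshold `log2_expected_models` is positive and grows linearly in r; above it the value is negative, so the expected number of satisfying assignments tends to 0.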

Surprisingly simple and fast algorithms are very effective at finding satisfying truth
assignments when at least one exists if instances of k-SAT are generated according to
M(n,r,k). For example, the unit-clause algorithm is: repeatedly assign values arbitrarily
to variables in random order until some clause has just one non-falsified literal (a unit
clause), then assign to the variable associated with that literal the value which satisfies the
unit clause and repeat these two steps until all variables have been assigned values. In [5]
it is shown that the unit-clause algorithm finds a satisfying truth assignment for a random
instance of k-SAT with bounded probability under M(n,r,k) if
$n/r < (2^{k-1}/k)((k-1)/(k-2))^{k-2}$. A generalization of the unit-clause heuristic (choose some variable that
appears in a smallest clause and assign to it the value which satisfies that clause) is shown
in [5] to find a satisfying truth assignment to almost all random instances of k-SAT if
$n/r < (0.46 \cdot 2^{k}/(k+1))((k-1)/(k-2))^{k-2} - 1$ and $4 \le k \le 40$, and if $40 < k$ and
$n/r < 10^{10}$ (for practical purposes this is all ratios of n/r). The following algorithm is
an improvement to the unit-clause algorithm: repeatedly assign a value which satisfies
a unit clause or, if no unit clauses exist, a value which satisfies most remaining clauses
(instead of assigning values arbitrarily). In [4] it is shown that this improvement results in
an algorithm that finds a satisfying truth assignment for a random instance of 3-SAT with
bounded probability under M(n,r,3) when n/r < 2.9. Similar results have been obtained
for other instance distributions (see, for example, [9], [13] and [15]).
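
The unit-clause algorithm described above can be sketched as follows. This is an illustrative reading of the prose description, not the implementation analysed in [5]; the encoding of literals as signed integers, the function name, and the `rng` parameter are assumptions made for the example.

```python
import random


def unit_clause(clauses, variables, rng=random):
    """One pass of the unit-clause heuristic, without backtracking.

    Literals are nonzero ints: +v is the positive literal of variable v,
    -v the negative one. Returns a dict {variable: bool} that satisfies
    every clause, or None if some clause became falsified on the way.
    """
    clauses = [set(c) for c in clauses]
    unassigned = list(variables)
    rng.shuffle(unassigned)  # "variables in random order"
    assignment = {}
    while unassigned:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is not None:
            lit = next(iter(unit))  # forced: satisfy the unit clause
        else:
            v = unassigned[-1]
            lit = v if rng.random() < 0.5 else -v  # arbitrary value
        v = abs(lit)
        unassigned.remove(v)
        assignment[v] = lit > 0
        # Drop satisfied clauses; delete the falsified literal elsewhere.
        clauses = [c - {-lit} for c in clauses if lit not in c]
        if any(len(c) == 0 for c in clauses):
            return None  # a null clause appeared; this pass gives up
    return assignment
```

Because the procedure never backs up, a `None` result says nothing about satisfiability; it only means this particular run failed.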

Although the algorithms of the previous paragraph involve no backtracking, the
heuristics for choosing variables and values presented in those algorithms can easily be
incorporated into a backtrack structure. In fact, the Davis-Putnam Procedure (DPP), given
in [6] and [7], includes the unit-clause heuristic. Thus, the results mentioned in the
previous paragraph apply directly to DPP in the case that it stops when a satisfying truth
assignment is obtained. The heuristics employed by DPP are important since without
them DPP would almost always require time exponential in r for any fixed ratio of n to r
([10], [11]). In this paper we investigate how important these and other heuristics are to
backtracking when we are interested in verifying that no satisfying truth assignment exists
for an unsatisfiable instance of k-SAT. The class of heuristics we consider, when applied
to simple backtracking, produces the class of problem solving procedures known as Search
Rearrangement Backtracking (SRB) and discussed in [2], [14], [16] and [17].

We  present  SRB  as  an  algorithm  in  which  clauses  are  represented  as  sets  of  literals 
and  instances  are  represented  as  collections  of  clauses.  In  this  representation,  under  a 
partial  assignment  of  values  to  variables,  a  false  literal  is  removed  from  a  clause  and  a 
true  clause  is  removed  from  an  instance.  Let  H(I)  be  a  function  that  maps  instances 
I  of  SAT  to  boolean  variables  in  I.  The  H  function  represents  a  wide  class  of  heuristics 
for  dynamically  choosing  the  next  variable  to  eliminate  in  a  backtrack  search.  We  use  the 
convention that the positive literal associated with variable v is denoted $v$ and the negative
literal associated with variable v is denoted $\bar{v}$. Search Rearrangement Backtracking applied
to I is

SRB(I):

    If I has a null clause then return “UNSAT”

    Else if I is empty then return “SAT”

    Else

        v ← H(I)

        I_1 ← {c − {\bar{v}} : c ∈ I, v ∉ c}
        I_2 ← {c − {v} : c ∈ I, \bar{v} ∉ c}

        If SRB(I_1) = “UNSAT” and SRB(I_2) = “UNSAT” then return “UNSAT”

        Else return “SAT”

In SRB, $I_1$ is the subinstance of SAT obtained from I by assigning the value true to variable
v and $I_2$ is the subinstance obtained by assigning the value false to v. Although it is
not necessary to do so, we have restricted H by forbidding it to choose the same variable
twice in the same branch of a backtrack search tree. A computation in which a variable is
selected twice in the same branch can always be transformed to a shorter computation in
which no variable is selected twice in the same branch. Therefore our lower bounds using
restricted H functions apply to all H functions. Our H function is such that SRB does
not include algorithms where the choice of v is determined randomly. However, there is a
best H(I) for every I (which minimizes the time to develop a refutation for I) and this H
will perform no worse than any randomized method for choosing v. Therefore, our result
provides a lower bound for randomized methods.
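
The SRB procedure can be written as a short recursive function. The sketch below is one possible rendering under stated assumptions: clauses are frozensets of signed integers, the heuristic H is passed in as the hypothetical parameter `choose`, and the function also counts search-tree nodes, which the pseudocode does not do. Like the pseudocode, it evaluates both subinstances before combining the verdicts.

```python
def srb(instance, choose):
    """Return (verdict, nodes): verdict is "SAT" or "UNSAT", nodes is the
    number of search-tree nodes visited.

    instance: list of frozensets of nonzero ints (+v / -v literals).
    choose:   stand-in for H; maps a nonempty instance without null
              clauses to a variable (positive int) occurring in it.
    """
    if any(len(c) == 0 for c in instance):
        return "UNSAT", 1
    if not instance:
        return "SAT", 1
    v = choose(instance)
    # v := true  -> clauses containing v are satisfied, -v is deleted.
    i1 = [c - {-v} for c in instance if v not in c]
    # v := false -> clauses containing -v are satisfied, v is deleted.
    i2 = [c - {v} for c in instance if -v not in c]
    r1, n1 = srb(i1, choose)
    r2, n2 = srb(i2, choose)
    verdict = "UNSAT" if r1 == r2 == "UNSAT" else "SAT"
    return verdict, 1 + n1 + n2


def first_variable(instance):
    """A deliberately naive H: the smallest-numbered variable still present."""
    return min(abs(l) for c in instance for l in c)
```

Any other deterministic `choose` function yields another member of the class SRB; the lower bounds in this paper are meant to hold for every such choice.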

Note that SRB is actually the class of all backtracking algorithms for instances of SAT
which invoke backup when some clause has become falsified or a satisfying truth assignment
has been found. Each algorithm in the class is distinguished by its H function which may
cause dynamic or static variable elimination, and need only return a value in finite time. An
interesting class of backtracking algorithms, known as Multi-Level Search Rearrangement
Backtracking algorithms, was the inspiration for this paper. These algorithms, as analysed
in  [2],  [14],  [16]  and  [17],  for  example,  fall  into  the  class  SRB  if  the  computational  effort 
required  to  evaluate  potential  variable  eliminations  is  allowed  to  show  up  as  part  of  the 
search  tree.  The  H  functions  associated  with  these  algorithms  are  fairly  complicated  and 
involve  looking  many  levels  deeper  into  the  search  tree  to  pick  the  next  variable  elimination 
which  is  most  likely  to  result  in  a  small  subtree.  The  results  of  [2]  and  [17]  are  that 
the H function has a significant impact on search tree size. For example, in [17], 3-SAT
instances containing 4096 clauses composed from 256 variables (n/r = 16) were solved by
Two-Level Search Rearrangement Backtracking using $10^{-17}$ times as many search tree nodes, on
the average, as ordinary backtracking.


The most important previous work on average case analysis of backtracking algorithms
for the Satisfiability problem using the model M(n,r,k) appears in [3] and [16]. The
algorithms of both papers search for all solutions to a given input. In [3] it is shown that,
under M(n,r,k), if $n/r = r^{\alpha-1}$ where $1 < \alpha < k$, $\alpha$ a constant, then ordinary backtrack trees
contain at least $e^{\Theta(r^{(k-\alpha)/(k-1)})}$ nodes on the average. Recall that, under the condition of the
previous sentence, almost all inputs have no solutions. Therefore, the result of [3] implies
that ordinary backtracking requires exponential average time to verify unsatisfiability if
$n/r = r^{\alpha-1}$, $1 < \alpha < k$, but the exponential is sublinear and decreases with $\alpha$ until
$\alpha = k$. When $\alpha > k$ (that is, $n/r > r^{k-1}$) then results in [3] imply that ordinary
backtracking requires polynomial average time. In [16] the same kind of results are obtained
for Simple Search Rearrangement Backtracking. In particular, over the range $1 < \alpha < k-1$,
Simple-Search-Rearrangement-Backtrack trees contain at least $e^{\Theta(r^{(k-\alpha-1)/(k-2)})}$ nodes on
the average. Thus, Simple Search Rearrangement Backtracking has exponentially better
average case performance than ordinary backtracking if $n/r = r^{\alpha-1}$, $1 < \alpha < k-1$.
The algorithms of [3] and [16] may be regarded as specific forms of SRB (that is, specific
H functions) if the “look-ahead” effort is taken into account. This paper shows that no
matter how clever the H function, even if it is vastly improved over another H function,
it will not be clever enough to yield polynomial average time on almost all unsatisfiable
instances of k-SAT if $n/r(n) = o(n^{1/\ln\ln(n)})$ and $n/r(n) > \ln(2)/(-\ln(1 - 2^{-k}))$ for all
n > 0.

We are interested in the performance of SRB when inputs are almost always
unsatisfiable; that is, when inputs are generated according to M(n,r,k) and
$n/r > \ln(2)/(-\ln(1 - 2^{-k}))$. We prove that all algorithms in the class SRB require time exponential in r almost
always when $n/r = \lambda$ where $\lambda$ is fixed and is greater than $\ln(2)/(-\ln(1 - 2^{-k}))$. The proof
itself is interesting because it relies on a structural property of instances of k-SAT which
must be present in almost all random instances and cannot be present if the search tree
corresponding to the execution of SRB on unsatisfiable instances is small. The property,
loosely speaking, is that the number of pairs of clauses containing literals associated with
the same variable is small if $n/r = \lambda$ for any fixed $\lambda$. Although unsatisfiable instances of
k-SAT are generated under M(n,r,k) when $n/r < \ln(2)/(-\ln(1 - 2^{-k}))$, we are unable
to use the property mentioned to extend the results to that range because almost all
instances in that range are satisfiable (so it is possible that almost all instances have the
property but almost no unsatisfiable instances do). We also show that SRB requires
superpolynomial time if $n/r = o(n^{1/\ln\ln(n)})$ and $\lim_{n\to\infty} n/r = \infty$.

We also show that the same result applies to any form of DPP. DPP looks for
unit-clauses and pure-literals. For our purposes, a pure-literal is a variable which appears only
as a positive literal or only as a negative literal in I. DPP is like SRB in the sense that there
is some heuristic function H which selects the next variable to be assigned a value. The
heuristic function of DPP selects a pure-literal next or, if no pure-literals are present, a
unit-clause next or, if no unit-clauses are present, a variable with highest “weight”. DPP
differs from SRB in that either $I_1$ or $I_2$ but not both is used as a recursive argument to
DPP when the selected variable is a pure-literal in I. It is this feature of DPP that prevents
us  from  directly  applying  the  results  to  DPP.  However,  we  will  show  how  to  design  an  SRB 
algorithm  from  a  given  DPP  algorithm  which  runs  faster  (generates  fewer  nodes)  than 
the  DPP  algorithm  if  the  given  instance  is  unsatisfiable.  This  implies  that  the  result  also 
holds  for  DPP. 
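
To make the contrast with SRB concrete, here is a minimal sketch of a DPP-style search following the selection order described above (pure-literal, then unit-clause, then highest weight). It is an assumption-laden illustration rather than the procedure of [6] and [7]: clauses are frozensets of signed integers, and the “weight” of a variable is taken, hypothetically, to be its total number of occurrences, since the text does not define it.

```python
from collections import Counter


def dpp(instance):
    """Return (verdict, nodes) for a list of frozensets of int literals."""
    if any(len(c) == 0 for c in instance):
        return "UNSAT", 1
    if not instance:
        return "SAT", 1
    counts = Counter(l for c in instance for l in c)
    pure = next((l for l in counts if -l not in counts), None)
    if pure is not None:
        # Pure literal: only the branch that satisfies it is explored.
        verdict, n = dpp([c for c in instance if pure not in c])
        return verdict, 1 + n
    unit = next((c for c in instance if len(c) == 1), None)
    if unit is not None:
        lit = next(iter(unit))
    else:
        # Hypothetical "weight": occurrences of the variable, both signs.
        lit = max(counts, key=lambda l: counts[l] + counts[-l])
    r1, n1 = dpp([c - {-lit} for c in instance if lit not in c])
    r2, n2 = dpp([c - {lit} for c in instance if -lit not in c])
    return ("UNSAT" if r1 == r2 == "UNSAT" else "SAT"), 1 + n1 + n2
```

The single recursive call in the pure-literal case is the feature that takes DPP outside the class SRB, where both subinstances are always passed on.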

The results we get are pessimistic and are possibly surprising to those familiar with
a result of Purdom [14] which states that even ordinary backtracking can verify
unsatisfiability in polynomial time, on the average, for a variety of relationships between model
parameters. Purdom’s random clause model and model M have certain similarities. In
both models n clauses are independently constructed from r boolean variables. But, in the
random clause model, instead of a fixed number of literals per clause, each literal appears
independently in a clause with probability p. Thus, if we set 2pr = k, clauses have k
literals, on the average, as for model M. Also, if $n/r > \ln(2)/(-\ln(1 - (1-p)^{r}))$ then almost all
instances are unsatisfiable. If 2pr = k this condition is nearly $n/r > \ln(2)/(-\ln(1 - e^{-k/2}))$
which is similar to the condition that almost all instances are unsatisfiable in model M.
The results of Purdom’s work in this area are that various backtracking algorithms exhibit
very different average case behavior depending on the values given to parameters.
Purdom’s results are interesting (and parallel our own results) because they show that these
algorithms are fast on the average if instances are usually “very” unsatisfiable, are slow
if instances are “moderately” unsatisfiable or “moderately” satisfiable (not too many
literal links between clauses), and fast if instances are “highly” satisfiable. Model M also
has this property. When $n/r(n) > (r(n))^{k-1}$, or $n/r(n) < \ln(n)/n$ there are SRB
algorithms that almost always solve problems in polynomial time (see [3] and [15]). This paper
is concerned with the range $n/r > \ln(2)/(-\ln(1 - 2^{-k}))$ and $n/r = o(n^{1/\ln\ln(n)})$, which is in the
intermediate region for model M.

A result of [14] is that if 2pr = k, k fixed, then for n/r big enough, ordinary
backtracking requires polynomial time, on the average. This result appears to be strikingly
different from ours but can be accounted for in the following way. Under the random
clause model, the probability that a random instance contains a null clause (no literals) is
$1 - (1 - (1-p)^{2r})^{n}$ which is $1 - (1 - e^{-k})^{n}$ in the limit if 2pr = k. But, if a null clause
appears in the given instance, backtracking stops and states the given instance is
unsatisfiable without doing any searching at all. The time required by backtracking in that case is
the time to locate a null clause which is O(n) at worst. If the time required by
backtracking to verify the unsatisfiability of instances that do not originally contain a null clause
is $(1 - e^{-k})^{-n}$ (that is, exponential in n) then the average time required by backtracking
is $O(n)\cdot(1 - (1 - e^{-k})^{n}) + (1 - e^{-k})^{n}(1 - e^{-k})^{-n} = O(n)$. Thus it is possible that all
or nearly all random instances with no null clauses which are generated under the
random clause model with 2pr = k are solved in exponential time by backtracking and yet
the average time for backtracking is polynomial in n. Model M does not generate any null
clauses so our result is not inconsistent with polynomial average time under the random
clause model not only for ordinary backtracking but for more sophisticated forms of
backtracking such as the algorithm in [17]. Another way to look at the relationship between
our result and the results under the random clause model is to regard model M as
generating a very small and non-easy subset of the instances that the random clause model
generates.
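
The arithmetic of the preceding paragraph can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, the cost of locating a null clause is taken to be exactly n, and the cost of the remaining instances is taken to be exactly $(1 - e^{-k})^{-n}$, as in the thought experiment above.

```python
import math


def null_clause_probability(n: int, r: int, p: float) -> float:
    """P(some clause is null) when each of the 2r literals enters each of
    the n clauses independently with probability p."""
    return 1 - (1 - (1 - p) ** (2 * r)) ** n


def average_cost(n: int, k: int) -> float:
    """Average cost in the limiting model with 2pr = k.

    With probability 1 - q a null clause is present and the cost is n;
    otherwise the cost is assumed to be the exponential 1 / q.
    """
    q = (1 - math.exp(-k)) ** n  # probability of no null clause
    return n * (1 - q) + q * (1 / q)


if __name__ == "__main__":
    k, r, n = 3, 10_000, 50
    print(null_clause_probability(n, r, k / (2 * r)))
    print(1 - (1 - math.exp(-k)) ** n)  # limiting value
    print(average_cost(200, 3))
```

The exponentially expensive instances contribute exactly 1 to the average, so the total stays below n + 1 even though almost every instance without a null clause is assumed to be hard.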


2.  Analysis 

We use a binary tree, denoted $T_I(H)$, to model the execution of SRB for a particular H
function on a given instance I of k-SAT in the customary manner. Associated with each
non-leaf x in $T_I(H)$ is a boolean variable v(x) contained in I and a subinstance I(x) of
I. Associated with the edge connecting x to its left (right) child is the interpretation that
v(x) is assigned the value true (false), respectively. Associated with a path from the
root of $T_I(H)$ to any node x is the partial assignment P(x) of values to the variables
corresponding to nodes visited on that path except for x. Specifically, P(x) is a set of
variable/assignment pairs (v ← t), one pair for each v associated with a node on the path
from the root down to but not including x, where t is true (false) if the left (right) child
of the node associated with v is on the path to x. We defer a discussion on the meaning
and determination of I(x) until we develop some intuition about $T_I(H)$.

Associated with leaves and edges of $T_I(H)$ are labels corresponding to clauses in I.
Let each clause in I be given a name that uniquely identifies that clause. We label every
leaf x of $T_I(H)$ with the name of the clause that is null under P(x), if at least one clause
is null under P(x). If more than one clause is null under P(x) then x is labeled with the
name of one of the null clauses arbitrarily. If no clauses are null under P(x) then x is given
no label. We associate with each edge of $T_I(H)$ a set of literal/clause-label tuples (called
edge labels) as follows. Let x be a non-leaf of $T_I(H)$ with left child y and right child z.
Let (x,y) be the edge connecting x to y, let (x,z) be the edge connecting x to z. Let v(x)
be the variable associated with x. If l is a label given to a leaf of the subtree of $T_I(H)$
having y (alternatively z) as root, and the literal $v(x)$ (alternatively $\bar{v}(x)$) is in the clause
corresponding to l then $(v(x),l)$ (alternatively $(\bar{v}(x),l)$) is a member of the list of labels
associated with (x,y) (alternatively (x,z)). If l is in an edge label associated with edge e
then, for brevity, we say l is associated with e.

Figure 1 contains an example of the labeling of leaves, association of labels to edges,
and the association of variables and subinstances to nodes of a subtree of $T_I(H)$ rooted at
x given the instance

I = (v1,v2,v4)(v3,v5,v6)(v2,v3,v4)(v2,v5,v6)(v1,v2,v3)

and some H function where x is such that P(x) = {(v5 ← false), (v6 ← false)}. To
simplify the figure we have shown only the clause labels associated with edges: the literals
of the edge labels that are actually associated with edges are implied. If x is a leaf of
$T_I(H)$ and l labels x, then the label l is associated with exactly k edges on the path from
the root of $T_I(H)$ to x (namely those edges which represent the partial truth assignment
which falsifies all k literals in the clause labeled l). If x is not a leaf and has left child
y and right child z and the edge label $(v(x),l)$ is associated with the edge (x,y) then l
cannot be associated with any edge in the subtree of $T_I(H)$ rooted at z for the following
reasons: (1) v(x) cannot be associated with any node below and including z, (2) the clause
labeled l must contain $v(x)$, and (3) complementary literals are not allowed in the same
clause (so $(\bar{v}(x),l)$ cannot be associated with (x,z)). Consequently, l cannot be associated
with any leaf under z. Similarly, if the edge label $(\bar{v}(x),l)$ is associated with (x,z) then l
cannot be associated with any edge in the subtree of $T_I(H)$ rooted at y and cannot label
a leaf under y.

We now define I(x), a subinstance of I associated with node x. I(x) is the subset of
clauses in I that label leaves below x. Note that even if a clause labels many leaves below
x it appears only once in I(x). Also note that clauses in I(x) have exactly k literals.

In this paper we are concerned with verifying unsatisfiability. If an instance is
unsatisfiable then $T_I(H)$ is a refutation tree and has the property that all its leaves are labeled
since backtracking is only due to the emergence of a null clause. From now on it will be
understood that $T_I(H)$ is a refutation tree. We make the following simplifying assumption:

Assumption  S: 

At least one label is associated with every edge of $T_I(H)$.

If there exists an edge (x,y) with no associated edge labels then a computation involving
fewer nodes is possible: simply replace the subtree rooted at x with the subtree rooted at
y.  Thus,  lower  bounds  derived  from  the  simplifying  assumption  will  apply  to  all  search 
strategies.  The  effect  of  this  assumption  is  to  allow  backtracking  using  the  pure-literal 
rule.  We  will  see  this  more  clearly  at  the  end  of  the  next  section  when  we  consider  the 
Davis-Putnam  Procedure. 

Each clause label associated with a leaf of $T_I(H)$ must also be in edge labels associated
with k edges on the path from the root of $T_I(H)$ to that leaf or else some clause labeling
a leaf is not falsified by the truth assignment associated with that leaf. Furthermore, if a
clause labels more than one leaf then the k edges on each path from the root to such a leaf
must all be associated with the same k variables and assignments (namely those that make
the clause false). Thus, for any node x in a refutation tree such that p(x) distinct leaf
labels exist below x, the number of distinct edge labels in $T_I(H)$ due to those leaf labels is
kp(x). For every node x let c(x) be the number of distinct edge labels at or below x and
define h(x) = kp(x) − c(x), the number of edge labels above x which are due to leaf labels
below x. Clearly, if h(x) > 0 then x cannot be the root of a refutation tree. In Figure 1,
h(x) = 3·5 − 11 = 4 with specific contributions from (v5,C2), (v6,C2), (v5,C4) and (v6,C4)
which are the edge labels above x that are due to the clauses labeling leaves below x. We
will show that h(x) > 0 in a small tree, with probability tending to 1. This will be used
to show that $T_I(H)$ cannot be small, with probability tending to 1.

We also introduce a function common(x). Let x be a node in $T_I(H)$, let c(x) be the
number of distinct edge labels at or below x, and let var(x) be the number of distinct
variables associated with nodes at or below x. We define common(x) = c(x) − var(x). For
example, in Figure 1, c(x) = 11 and common(x) = 7.

Finally, we define a function Icomm(I'). Let I' be any collection of clauses. Then
Icomm(I') is the total number of literals in I' that are associated with variables that appear
two or more times in I' minus the number of distinct variables of that kind. Icomm(I') is
also the total number of literals in I' minus the number of distinct variables in I' since there
is exactly one literal in I' for every variable that appears once in I'. Note the similarity
between Icomm and common(x). However, Icomm is defined for any collection of clauses
and not just those sets of clauses corresponding to I(x) where x is a node of $T_I(H)$. We
even allow Icomm(I') to be defined if one or more clauses in I' contain duplicate or
complementary literals. We make use of Icomm to bound common(x) in the following
way. Let I be any instance of k-SAT. For any H function, let x be a node in $T_I(H)$ and
suppose I(x) contains p(x) clauses and common(x) > p(x)(1 + 1/ln(p(x))). Then there
is a subset I' of I containing p(x) clauses such that Icomm(I') > p(x)(1 + 1/ln(p(x))),
namely all the clauses of I(x). Hence we may make the following

Observation 1:

Suppose that no subset I' of clauses in an instance I of k-SAT which contains p clauses
is such that Icomm(I') > p(1 + 1/ln(p)). Then, for all H functions, there does not
exist a node x in $T_I(H)$ such that |I(x)| = p and common(x) > p(1 + 1/ln(p)) where
|I(x)| is the number of clauses in I(x).
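
The quantity Icomm and the bound in Observation 1 are easy to compute for a concrete collection of clauses. The following is a minimal sketch with hypothetical function names; clauses are sequences of signed integers so that duplicate or complementary literals, which the definition allows, are counted as separate occurrences. The sign of a literal does not affect Icomm.

```python
import math


def icomm(clauses) -> int:
    """Total number of literal occurrences minus number of distinct variables."""
    total = sum(len(c) for c in clauses)
    variables = {abs(l) for c in clauses for l in c}
    return total - len(variables)


def exceeds_bound(clauses) -> bool:
    """True if Icomm(I') > p(1 + 1/ln(p)), where p is the number of clauses."""
    p = len(clauses)
    return p >= 2 and icomm(clauses) > p * (1 + 1 / math.log(p))


if __name__ == "__main__":
    # Variable pattern of the five-clause example instance (signs omitted).
    example = [(1, 2, 4), (3, 5, 6), (2, 3, 4), (2, 5, 6), (1, 2, 3)]
    print(icomm(example), exceeds_bound(example))
```

For the five clauses above there are 15 literal occurrences over 6 variables, so Icomm is 9, which exceeds 5(1 + 1/ln(5)) ≈ 8.1. Theorem 1 below says that such dense small subsets are unlikely when n and r are large.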

We will use Observation 1 to show that if n/r(n) doesn’t grow too fast then, with
probability tending to 1, common(x) is small for every node x such that the number of
distinct clause labels below x is $O(n^{\epsilon(n)})$, where $\epsilon(n)$ is not very small asymptotically. The
following theorem and corollary state this more precisely.

Theorem 1:

Let I be an instance of k-SAT generated according to M(n,r,k). Let $\omega(n)$ be any
function that decreases asymptotically to 0 and is such that $\lim_{n\to\infty} n^{\omega(n)} = \infty$.
Suppose that $n/r(n) = \lambda(n)$ obeys $\lambda(n) = o(n^{\omega(n)})$ and $\lambda(n) > \ln(2)/(-\ln(1 - 2^{-k}))$
for all n > 0. Let $\epsilon(n) = 1/(\ln(e^{2}k^{k+2}\lambda(n)) + 3)$. Then the probability that there
exists a subset I' of I with $p \le n^{\epsilon(n)}$ clauses and such that Icomm(I') is greater than
p(1 + 1/ln(p)) tends to 0 as n tends to infinity.


Proof:

In the rest of this proof and in the remaining proofs and discussion in this paper we
use $\epsilon$ for $\epsilon(n)$ and $\lambda$ for $\lambda(n)$ to avoid clutter. The probability that there exists a subset
I' of I containing p clauses such that Icomm(I') > a is less than the average number
of such subsets. The average number of such subsets is the sum of the probabilities
that each p-clause subset I' of I has Icomm(I') > a. This is $\binom{n}{p}$ times the probability
that Icomm(I') > a where I' is a random p-clause subset of I. The probability that
Icomm(I') = i is the number of ways to construct I' such that Icomm(I') = i divided
by $(2^{k}\binom{r}{k})^{p}$, the number of possible p-clause subsets of I. The number of ways to
construct I' such that Icomm(I') = i is less than the number of ways to construct I'
such that Icomm(I') = i if clauses were allowed to have duplicate or complementary
literals. But the number of ways to construct I' such that Icomm(I') = i and I'
is allowed to have duplicate or complementary literals in the same clause is $2^{kp}$, the
number of ways to assign positive and negative literal values to kp literals, times the
number of ways to partition kp literal place-holders into kp − i variable groups with
labels taken from 1 to r. Using braces to denote Stirling numbers of the second kind,
the latter number is ${kp \brace kp-i}\, r!/(r-kp+i)!$ (see [18], pages 133 and 134 for a detailed
explanation of this quantity). It can easily be shown that $\binom{r}{k} \ge (r/k)^{k}$ for all integers
$r \ge k > 0$. Furthermore, from the appendix, ${kp \brace kp-i} \le (kp)^{2i}/i!$. Then, since
$r!/(r-kp+i)! \le r^{kp-i}$, the average number of subsets of I containing p clauses and such
that Icomm(I') > a is less than

$$\binom{n}{p}\sum_{i \ge a}\frac{(kp)^{2i}\,k^{kp}}{i!\,r^{i}} \qquad (1)$$

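Defining a new variable j = i − p, and bringing $\binom{n}{p}$ into the summand, we see that
(1) is equal to

$$\sum_{j \ge a-p}\frac{(kp)^{2(j+p)}\,n!\,k^{kp}}{(j+p)!\,p!\,(n-p)!\,r^{(j+p)}}$$

$$\le \sum_{j \ge a-p}\frac{\sqrt{n}\,(n/e)^{n}\,e^{\theta/12n}\,(kp)^{2(j+p)}\,k^{kp}}{2\pi\sqrt{(j+p)p(n-p)}\,((j+p)/e)^{(j+p)}\,(p/e)^{p}\,((n-p)/e)^{(n-p)}\,r^{(j+p)}} \qquad (2)$$

by Stirling’s formula for factorials (that is, $x! = \sqrt{2\pi x}\,(x/e)^{x}e^{\theta/12x}$, $x > 0$, $0 < \theta < 1$).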
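Rearranging terms and noticing that $e^{\theta/12n}/\pi < 1$ since $n \ge 2$, (2) can be bounded
from above by

$$\sum_{j \ge a-p}\frac{n^{p}}{r^{(j+p)}}\left(\frac{n}{n-p}\right)^{(n-p)}\frac{(ek^{2}p^{2})^{(j+p)}\,k^{kp}}{p^{p}\,(j+p)^{(j+p)}}\sqrt{\frac{n}{4(j+p)p(n-p)}}$$

$$< \sum_{j \ge a-p}\left(\frac{n}{r}\right)^{p}\left(\frac{n}{n-p}\right)^{(n-p)}\frac{(ek^{2}p^{2})^{(j+p)}\,k^{kp}}{r^{j}\,p^{p}\,(j+p)^{(j+p)}}$$

since $n \ge 2$, $\epsilon < 1/3$ (and therefore $p \le n^{1/3}$), and $p \ge 1$ implies

$$\sqrt{\frac{n}{4(j+p)p(n-p)}} \le \sqrt{\frac{1}{4(j+p)p(1-n^{-2/3})}} \le \sqrt{\frac{1}{1.2(j+p)p}} < 1$$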


(actually  we  could  derive  a  similar  upper  bound  for  e  <  1  but  it  is  unnecessary  to  do 
so  to  get  our  main  result).  By  making  use  of  the  fact  that  (1  —  p/n)^n~p^  >  e~p  if  n 
and  p  are  positive  and  using  n/r  =  \  we  can  make  the  last  sum 


<  V  (e2ibt+2A)p(eJk2)Vp2> 

2^  rJ(j  +p)p(j  +p)i 


(3) 


j=a-p 
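The chain of bounds leading from the factorial form of the summand to the summand of (3) can be sanity-checked numerically. The sketch below is illustrative only: the function names and the sample parameters are assumptions, and the comparison is done in log space to avoid overflow.

```python
import math

def log_term_exact(n, r, k, p, j):
    """Log of (kp)^{2(j+p)} n! k^{kp} / ((j+p)! p! (n-p)! r^{j+p})."""
    i = j + p
    return (2 * i * math.log(k * p) + math.lgamma(n + 1) + k * p * math.log(k)
            - math.lgamma(i + 1) - math.lgamma(p + 1) - math.lgamma(n - p + 1)
            - i * math.log(r))

def log_term_bound(n, r, k, p, j):
    """Log of the summand of (3), with lambda = n/r."""
    lam = n / r
    return (p * (2 + (k + 2) * math.log(k) + math.log(lam))  # (e^2 k^{k+2} lambda)^p
            + j * (1 + 2 * math.log(k))                      # (e k^2)^j
            + (p + 2 * j) * math.log(p)                      # p^p p^{2j}
            - j * math.log(r)                                # r^j
            - (p + j) * math.log(j + p))                     # (j+p)^{p+j}

# Hypothetical sample point: n = 1000 clauses, r = 200 variables, k = 3.
n, r, k = 1000, 200, 3
ok = all(log_term_exact(n, r, k, p, j) <= log_term_bound(n, r, k, p, j)
         for p in range(1, 11) for j in range(0, 40))
print(ok)
```

The check covers only p up to about $n^{1/3}$, the range used in the proof.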


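Suppose a = p + p/ln(p) (that is, j ≥ p/ln(p)). Then for sufficiently large n

$$\frac{e k^{2} p^{2}}{r\,(j+p)} < \frac{e k^{2} p}{r} < 1/2$$

since $\lambda = o(n^{\omega(n)})$ and $p \le n^{\varepsilon}$ where ε < 1/3. Furthermore,

$$\frac{(e^{2} k^{k+2} \lambda)^{p}\, p^{p}}{(j+p)^{p}} < (e^{2} k^{k+2} \lambda)^{p} = (e^{2} k^{k+2} \lambda)^{\ln(p)\,(p/\ln(p))}.$$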


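Therefore, the summand of (3) is less than

$$\left(\frac{1}{2}\right)^{j - p/\ln(p)} \left(\frac{p^{\ln(e^{2} k^{k+2} \lambda)}\, e k^{2}\, p}{r}\right)^{p/\ln(p)}$$

for sufficiently large n. Hence the sum (1) is less than

$$2 \left(\frac{p^{\ln(e^{2} k^{k+2} \lambda)}\, e k^{2}\, p}{r}\right)^{p/\ln(p)}$$

for sufficiently large n if a = p(1 + 1/ln(p)). Thus, the probability that there exists a
subset I' of I containing $n^{\varepsilon}$ or fewer clauses such that Icomm(I') ≥ p(1 + 1/ln(p)) is
less than

$$\sum_{p=2}^{n^{\varepsilon}} \left(\frac{p^{\ln(e^{2} k^{k+2} \lambda)}\, e k^{2}\, p}{r}\right)^{p/\ln(p)} \qquad (4)$$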


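The derivative of the summand of (4) with respect to p is

$$\left(\ln(e^{2} k^{k+2} \lambda) + 1 + \frac{\ln(e k^{2}/r)\,(1 - 1/\ln(p))}{\ln(p)}\right) \left(\frac{p^{(\ln(e^{2} k^{k+2} \lambda)+1)}\, e k^{2}}{r}\right)^{p/\ln(p)}.$$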


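For sufficiently large r, $\ln(e k^{2}/r)$ is a negative number. Hence, for sufficiently large
r, $\ln(e k^{2}/r)(1 - 1/\ln(p))/\ln(p)$ becomes more positive as p increases when p ≥ 2.
Furthermore,

$$\left(\frac{p^{(\ln(e^{2} k^{k+2} \lambda)+1)}\, e k^{2}}{r}\right)^{p/\ln(p)}$$

increases as p increases, p > 1. Therefore, the derivative of the summand of (4) is
monotonically increasing with p, p ≥ 2, so the summand is maximum at either p = 2, p = 3 or
$p = n^{\varepsilon}$. At p = 2, it is straightforward to check that for any ε < 1/3

$$\left(\frac{2^{(\ln(e^{2} k^{k+2} \lambda)+1)}\, e k^{2}}{r}\right)^{2/\ln(2)} < n^{-2\varepsilon}$$

for large n if $\lambda = o(n^{\omega(n)})$ and k is fixed. Similarly for p = 3. At $p = n^{\varepsilon}$, where
$\varepsilon = 1/(\ln(e^{2} k^{k+2} \lambda) + 3)$, we have

$$\left(\frac{n^{(\ln(e^{2} k^{k+2} \lambda)+1)\varepsilon}\, e k^{2}}{r}\right)^{n^{\varepsilon}/(\varepsilon \ln(n))} = \left(e k^{2} \lambda\, n^{-2\varepsilon}\right)^{n^{\varepsilon}/(\varepsilon \ln(n))} < n^{-2\varepsilon}$$

for large enough n if $\lambda = o(n^{\omega(n)})$. Since the summand of (4) has a maximum less than
$n^{-2\varepsilon}$, the sum (4) is less than $n^{-\varepsilon}$. But $1/\ln(n^{\varepsilon}) \le (\omega(n)\ln(n) + \ln(e^{2} k^{k+2}) + 3)/\ln(n)$
which tends to zero as n gets large so $n^{-\varepsilon}$ tends to zero. This proves the theorem.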
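To make the quantity bounded by the theorem concrete, here is a minimal Python sketch that generates an instance and computes Icomm for a set of clauses. It assumes, as the proof above indicates, that a clause has k distinct variables with independent random signs; the names `random_instance` and `icomm` are illustrative and do not come from the paper.

```python
import random
from collections import Counter

def random_instance(n, r, k, rng):
    """n clauses, each with k distinct variables from 1..r and random signs."""
    return [tuple(v if rng.random() < 0.5 else -v
                  for v in rng.sample(range(1, r + 1), k))
            for _ in range(n)]

def icomm(clauses):
    """Literals on variables occurring at least twice, minus the number of such variables.

    This equals (total literals) - (distinct variables), since a variable
    occurring once contributes zero.
    """
    counts = Counter(abs(lit) for clause in clauses for lit in clause)
    return sum(c - 1 for c in counts.values())

# Small hand-checkable example: 9 literals over 6 variables.
example = [(1, 2, 3), (-1, 4, 5), (2, -4, 6)]
print(icomm(example))  # 3

instance = random_instance(50, 20, 3, random.Random(0))
```

For a subset of p clauses, the theorem compares this value with p(1 + 1/ln(p)).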

Corollary 1:

Let I be an instance of k-SAT generated according to M(n,r,k). Let ω(n) be any
function that decreases asymptotically to 0 and is such that $\lim_{n\to\infty} n^{\omega(n)} = \infty$.
Suppose that n/r(n) = λ(n) obeys $\lambda(n) = o(n^{\omega(n)})$ and $\lambda(n) > \ln(2)/(-\ln(1 - 2^{-k}))$
for all n > 0. Then the following statement is true with probability tending to 1. For
all H and all nodes x in $T_I(H)$ such that the number of distinct clause labels below
x is $p \le n^{\varepsilon}$, common(x) < p(1 + 1/ln(p)).


Proof:

Follows from Theorem 1, observation 1 and the fact that almost all instances are
unsatisfiable if $n/r > \ln(2)/(-\ln(1 - 2^{-k}))$.
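The threshold in the corollary is easy to evaluate. The sketch below also shows the usual first-moment reading of it, under the assumption (not spelled out in this section) that a fixed truth assignment satisfies a random k-literal clause with probability $1 - 2^{-k}$, so that the expected number of satisfying assignments is $2^{r}(1 - 2^{-k})^{n}$. The function names are illustrative.

```python
import math

def unsat_threshold(k):
    """ln(2) / (-ln(1 - 2^-k)), the clause-to-variable ratio in the corollary."""
    return math.log(2) / (-math.log(1 - 2.0 ** (-k)))

def log2_expected_models(n, r, k):
    """log2 of 2^r (1 - 2^-k)^n, the assumed expected number of satisfying assignments."""
    return r + n * math.log2(1 - 2.0 ** (-k))

print(round(unsat_threshold(3), 2))          # about 5.19 for 3-SAT
print(log2_expected_models(600, 100, 3))     # negative: ratio 6 is above the threshold
print(log2_expected_models(400, 100, 3))     # positive: ratio 4 is below the threshold
```

When the ratio n/r exceeds the threshold, the logarithm is negative and decreases linearly in r, so the expected number of satisfying assignments tends to 0.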

For the sake of simplicity we drop the subscript from $T_I(H)$ in what follows. Corollary
1 gives a property that any search rearrangement backtrack tree has with probability
tending to 1 if instances are generated according to M(n,r,k). In Lemma 1 we show
that h(x) ≥ kp(x) − 2 * common(x) for all nodes x in T(H). This and the fact that
k ≥ 3 means that, with probability tending to 1, h(x) ≥ p(x)(1 − 2/ln(p(x))) for any node
x that is the root of a subtree containing $p(x) \le n^{\varepsilon}$, $\varepsilon = 1/(\ln(e^{2} k^{k+2} \lambda) + 3)$, distinct
clause labels. In Lemma 2 we derive an important property shared by almost all random
instances we consider. The property is that no variable appears in more than (ln(n) + 1)kλ
clauses. In Lemma 3 this property is used to show that, with probability tending to 1,
h(x) ≥ h(y) − (ln(n) + 1)kλ where x is a node in T(H) which is the parent of y (that is, the
h function cannot decrease by more than (ln(n) + 1)kλ per node as we move toward the root
of T(H)). So, with probability tending to 1, for any node x in T(H) such that h(x) ≥ L, the
number of nodes on the path from x to the root of T(H) is at least L/((ln(n) + 1)kλ)
(Theorem 3). In Theorem 4 we show that, with probability tending to 1, there exists
a node x such that $n^{\varepsilon/2} \ge p(x) \ge n^{\varepsilon/2}/2$. This means there is at least one node x in
T(H) such that $h(x) > n^{\varepsilon/2}(1 - 4/\ln(n^{\varepsilon}))$ and that the number of nodes on the path from
that node to the root is at least $n^{\varepsilon/2}(1 - 4/\ln(n^{\varepsilon}))/((\ln(n) + 1)k\lambda)$. We slice off all nodes in T(H)
that are deeper than $n^{\varepsilon/2}(1 - 4/\ln(n^{\varepsilon}))/((\ln(n) + 1)k\lambda)$ and call each node at that depth a
bottomnode. In Theorem 5 we show that, with probability tending to 1, on the path from
all bottomnodes to the root there are at least $n^{\varepsilon/4}(1 - 4/\ln(n^{\varepsilon}))/((\ln(n) + 1)k\lambda)$ nodes
for which both children have bottomnodes as descendants. This implies, with probability
tending to 1, an exponential treesize for T(H) if λ is fixed and superpolynomial treesize if
$\lambda(n) = o(n^{1/\ln\ln(n)})$ and $\lim_{n\to\infty} \lambda(n) = \infty$ (Theorems 6, 7 and 8).

First  we  derive  some  relationships  between  h(x),  common(x)  and  var(x). 

Theorem  2: 

For all H and x in T(H), h(x) = kp(x) − common(x) − var(x).

Proof:

Recall that common(x) + var(x) is the number of distinct edge labels at or below x.
The rest follows from the definition of h(x).

Theorem 2 leads to the following useful relationship between h(x) and common(x).

Lemma 1:

For any H function and x in T(H), h(x) ≥ kp(x) − 2 * common(x).

Proof:

Since every variable below x is associated with at least two distinct edge labels, c(x) ≥
2 * var(x). Therefore, var(x) ≤ common(x). This and Theorem 2 imply h(x) ≥
kp(x) − 2 * common(x).

The next two lemmas and theorem show that, for any H function and any node x in
T(H) such that $h(x) \ge n^{\varepsilon}$, the length of the path from the root of T(H) to x is great, with
probability tending to 1, if $\lambda = o(n^{\omega(n)})$ and $\lambda(n) > \ln(2)/(-\ln(1 - 2^{-k}))$ for all n > 0.


Lemma 2:

Let ω(n) be any function of n that tends to 0 asymptotically and is such that
$\lim_{n\to\infty} n^{\omega(n)} = \infty$. The probability that some variable appears in more than
(ln(n) + 1)kλ(n) clauses of an instance of k-SAT generated according to M(n,r,k) tends to
0 as n tends to infinity if $\lambda(n) = o(n^{\omega(n)})$ and $\lambda(n) > \ln(2)/(-\ln(1 - 2^{-k}))$ for all
n > 0.


Proof: 


Let v be a variable taken from V. The average number of clauses containing v is
kn/r = kλ. The probability that v is in at least (ln(n) + 1)kλ clauses is

$$\sum_{i=(\ln(n)+1)k\lambda}^{n} \binom{n}{i} \left(\frac{k}{r}\right)^{i} \left(1 - \frac{k}{r}\right)^{n-i} < e^{-\ln^{2}(n)\,k\lambda/3}$$

from the Chernoff bound for binomial distributions [1] and [8]. The average number
of variables that appear in at least (ln(n) + 1)kλ clauses is therefore less than

$$r\, e^{-\ln^{2}(n)\,k\lambda/3} = \frac{r}{e^{\ln(n)\ln(n)k\lambda/3}} = \frac{r}{n^{\ln(n)k\lambda/3}} = \frac{1}{\lambda} \cdot \frac{1}{n^{\ln(n)k\lambda/3 - 1}} \to 0 \text{ as } n \to \infty.$$

Since the average number of variables that appear in at least (ln(n) + 1)kλ clauses
is an upper bound on the probability that there exists a variable that appears in at
least (ln(n) + 1)kλ clauses the lemma is proved.
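The occurrence bound of Lemma 2 can be compared with a simulation. The following sketch is an illustration under assumed parameters (n = 2000, r = 400, k = 3, a fixed seed), with each clause drawing k distinct variables uniformly; the helper names are hypothetical.

```python
import math
import random
from collections import Counter

def max_occurrences(n, r, k, rng):
    """Largest number of clauses that any single variable appears in."""
    counts = Counter()
    for _ in range(n):
        counts.update(rng.sample(range(1, r + 1), k))  # k distinct variables per clause
    return max(counts.values())

def lemma2_bound(n, r, k):
    """(ln(n) + 1) * k * lambda with lambda = n / r."""
    return (math.log(n) + 1) * k * (n / r)

n, r, k = 2000, 400, 3
observed = max_occurrences(n, r, k, random.Random(1))
print(observed, lemma2_bound(n, r, k))
```

The mean number of occurrences is kλ = 15 here, while the bound is roughly 129, so the observed maximum sits far below it.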

In  what  follows  we  show  that  the  search  tree  for  any  H  must  be  exponentially  large 
if  the  input  has  the  properties  stated  in  Lemma  2  and  Corollary  1. 


Lemma 3:

Let ω(n) be any function that tends to zero asymptotically and is such that
$\lim_{n\to\infty} n^{\omega(n)} = \infty$. The following statement is true with probability tending to 1.
For all H functions and parent nodes x in T(H) with child y, h(x) ≥ h(y) − (ln(n) + 1)kλ(n) if
$\lambda(n) = o(n^{\omega(n)})$ and $\lambda(n) > \ln(2)/(-\ln(1 - 2^{-k}))$ for all n > 0.

Proof:

Let there be $s_y$ labels associated with the edge connecting x with y and $s_z$ labels
associated with the edge connecting x to its other child z. Let $N_{yz}(i)$ denote the number of clauses
that appear as edge labels i times in the path from y to a leaf of T(H) labeled by
such clauses and in the path from z to a leaf of T(H) labeled by such clauses (note
that the labels associated with these clauses contribute k − i to h(x) but twice this to
h(y) + h(z)). Therefore,

$$h(x) = h(y) + h(z) - (s_y + s_z) - \sum_{i=1}^{k} (k - i)\,N_{yz}(i). \qquad (5)$$

Observe that in equation (5) $h(z) - \sum_{i=1}^{k} (k - i)\,N_{yz}(i) \ge 0$. Hence h(x) ≥ h(y) −
$(s_y + s_z)$. But, from Lemma 2, the probability that no variable appears in more
than (ln(n) + 1)kλ clauses tends to 1. Since the variable associated with x is in
clauses with labels associated with edges connecting x to its children, we have that
$s_y + s_z \le (\ln(n) + 1)k\lambda$ for all H and x in T(H) with probability tending to 1. The
lemma follows.

Theorem  3: 

Let L(n) be any positive integer function of n and let ω(n) be any function of n that
tends to 0 asymptotically and is such that $\lim_{n\to\infty} n^{\omega(n)} = \infty$. The following statement
holds with probability tending to 1. For all H functions, the pathlength of any path
from the root of T(H) to a node x such that h(x) ≥ L(n) is at least L(n)/((ln(n) + 1)kλ)
if $\lambda(n) = o(n^{\omega(n)})$ and $\lambda(n) > \ln(2)/(-\ln(1 - 2^{-k}))$ for all n > 0.

Proof:

Follows immediately from Lemma 3 and the fact that h(root(T(H))) = 0.
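The telescoping step behind this proof can be written out as follows. The path notation $x_0, \ldots, x_d$ is introduced here only for illustration: $x_0$ is the root of T(H), $x_d = x$, and each $x_{i-1}$ is the parent of $x_i$.

```latex
% Apply Lemma 3 to every parent/child pair on the path from the root to x.
\begin{align*}
L(n) \le h(x_d)
  &\le h(x_{d-1}) + (\ln(n) + 1)k\lambda \\
  &\le \cdots \\
  &\le h(x_0) + d\,(\ln(n) + 1)k\lambda
   = d\,(\ln(n) + 1)k\lambda ,
\end{align*}
% since h(x_0) = h(root(T(H))) = 0. Hence
\[
  d \;\ge\; \frac{L(n)}{(\ln(n) + 1)k\lambda}.
\]
```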

Theorem  4: 

Let ω(n) be any function of n that tends to 0 asymptotically and is such that
$\lim_{n\to\infty} n^{\omega(n)} = \infty$. Let 0 < γ < 1 be fixed and let $\varepsilon(n) = 1/(\ln(e^{2} k^{k+2} \lambda(n)) + 3)$.
The following statement is true with probability tending to 1. For all H functions and
k ≥ 3, there is at least one subtree of T(H) with at least $n^{\gamma\varepsilon}/2$ and at most $n^{\gamma\varepsilon}$
distinct clause labels below its root if $\lambda(n) = o(n^{\omega(n)})$ and $\lambda(n) > \ln(2)/(-\ln(1 - 2^{-k}))$
for all n > 0.


Proof:


From the root of T(H) move toward a leaf y, visiting nodes as follows: at each visited
node z, visit next the child of z which has the greatest number of distinctly labeled
leaves beneath it (decide ties arbitrarily). Call the path just traced P. Let x be
a node on P. The number of distinctly labeled leaves beneath the parent of x is
no greater than twice p(x) because the number of distinctly labeled leaves beneath
the sibling of x is less than p(x) (otherwise we would have moved in the direction of
the sibling on the way down). Furthermore, the number of distinctly labeled leaves
beneath the parent of x is at least p(x) + 1 since the clause labeling the sibling of x
cannot be below x. Thus, if we move up s nodes from y we will be at a node which
has at least s + 1 and at most $2^{s}$ distinctly labeled leaves beneath it. We can certainly
move up P from y as long as the number of distinct clauses beneath the currently
visited node is less than 8 since at least 8 clauses are required for a refutation of
k-SAT where k ≥ 3. From Lemma 1 and Corollary 1 we have that, for any node x on
P such that $p(x) \le n^{\varepsilon}$, h(x) ≥ p(x)(1 − 2/ln(p(x))), with probability tending to 1.
Thus, if $8 \le p(x) \le n^{\gamma\varepsilon}$, 0 < γ < 1, then h(x) > 0 hence x is not the root of T(H).
Therefore, we can move up P from y to the last node z such that $p(z) \ge n^{\gamma\varepsilon}/2$. Since
p(father(z)) can be at most double p(z) we have $p(z) \le n^{\gamma\varepsilon}$. The node z is the one
required to prove the theorem.
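The path-tracing argument can be tried on a toy tree. This is a sketch under assumed conventions that are not the paper's: a leaf is a string holding its clause label, an internal node is a non-empty tuple of children, and `target` plays the role of $n^{\gamma\varepsilon}$.

```python
def labels(node):
    """Set of distinct leaf labels at or below node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(labels(child) for child in node))

def heavy_path(root):
    """Path from the root to a leaf, always entering the child with most distinct labels."""
    path = [root]
    while not isinstance(path[-1], str):
        path.append(max(path[-1], key=lambda child: len(labels(child))))
    return path

def find_node(root, target):
    """Climbing the heavy path from its leaf, return the first node with >= target/2 labels."""
    for node in reversed(heavy_path(root)):
        if len(labels(node)) >= target / 2:
            return node
    return None

# Hypothetical tree with six distinct clause labels c1..c6 (c1 appears twice).
tree = ((("c1", "c2"), ("c3", "c4")), (("c5", "c1"), "c6"))
for target in range(1, 7):
    p = len(labels(find_node(tree, target)))
    print(target, p)
```

Because the label count at most doubles per step up the path, the node returned always has between target/2 and target distinct labels whenever target does not exceed the count at the root.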

We call a node that is the root of a subtree of T(H) containing between $n^{\varepsilon/2}/2$ and
$n^{\varepsilon/2}$, $\varepsilon = 1/(\ln(e^{2} k^{k+2} \lambda) + 3)$, distinct clause labels a bignode. Observe that within
the proof of Theorem 4 it was shown that if x is a bignode then, since k ≥ 3, and
$n^{\varepsilon/2}/2 \le p(x) \le n^{\varepsilon/2}$, $h(x) \ge n^{\varepsilon/2}(1 - 4/(\varepsilon\ln(n) - 2\ln(2)))/2$. Therefore, Theorems
3 and 4 say that, with probability tending to 1, bignodes exist for every H function
and that all paths from the root of T(H) to bignodes must contain at least
$n^{\varepsilon/2}(1 - 4/(\varepsilon\ln(n) - 2\ln(2)))/(2(\ln(n) + 1)k\lambda)$ nodes. Call a node at depth
$n^{\varepsilon/2}(1 - 4/(\varepsilon\ln(n) - 2\ln(2)))/(2(\ln(n) + 1)k\lambda)$ a bottomnode. All bignodes must be descendants of bottomnodes.
Hence at least one bottomnode exists in T(H). The next two theorems tell us that,
for any H function, the number of bottomnodes in T(H) is exponential with probability
tending to 1 if λ is fixed and $\lambda > \ln(2)/(-\ln(1 - 2^{-k}))$. Theorem 8 says that the number
of bottomnodes is superpolynomial with probability tending to 1 if $\lambda = o(n^{1/\ln\ln(n)})$ and
$\lambda > \ln(2)/(-\ln(1 - 2^{-k}))$ for all n > 0.

Theorem  5: 

Let ω(n) be any function of n that tends to 0 asymptotically and is such that
$\lim_{n\to\infty} n^{\omega(n)} = \infty$. Let $\varepsilon(n) = 1/(\ln(e^{2} k^{k+2} \lambda(n)) + 3)$. The following statement holds
with probability tending to 1. If $\lambda(n) = o(n^{\omega(n)})$ and $\lambda(n) > \ln(2)/(-\ln(1 - 2^{-k}))$
for all n > 0 then, for all H functions, on every path from the root of T(H) to a
bottomnode there are at least

$$n^{\varepsilon(n)/4}\,(1 - 4/\ln(n^{\varepsilon(n)}))/((\ln(n) + 1)\,k\,\lambda(n))$$

nodes x such that both children of x are ancestors of bottomnodes.
Proof:

The restriction on λ causes random instances of k-SAT to be unsatisfiable with
probability tending to 1 so in what follows we can skip over the cases where the required
trees don't exist and consider only those trees which are refutation trees (that is, the
trees in which all leaves are labeled with clause labels). Consider any path P from
root to bottomnode in T(H). For some nodes on P both children are ancestors of
bottomnodes and for the remaining nodes on P exactly one child is an ancestor of
a bottomnode (we call the other child an Orphan and its subtree an Orphaned
subtree). Call nodes of the first kind Binary and nodes of the second kind Unary. The
capital letters distinguish Binary and Unary nodes from ordinary binary and unary
nodes of a search tree. We use the terms Binary and Unary because Binary nodes
have two connections to subtrees containing bottomnodes and Unary nodes have only
one. It will be understood in what follows that Binary and Unary nodes are on P and
that Orphans and Orphaned subtrees are attached to Unary nodes. Let Pp denote
the number of distinct clauses labeling the leaves of Orphaned subtrees.

Suppose that $Pp \le n^{\varepsilon/4}$. All of the clause labels associated with each edge connecting
an Orphan node with a Unary are different from the clause labels associated with all
other edges connecting Orphan nodes to Unary nodes since a clause label associated
with any edge cannot appear below the sibling of its endpoint. But $Pp \le n^{\varepsilon/4}$ so
the number of Unary nodes is at most $n^{\varepsilon/4}$. Therefore, there can be no more than
$n^{\varepsilon/4}$ Unary nodes on the path from root to bottomnode. Since the number of Binary
nodes is the number of nodes on P minus the number of Unary nodes, and since the
number of nodes on P is $n^{\varepsilon/2}(1 - 4/(\ln(n^{\varepsilon}) - 2\ln(2)))/(2(\ln(n) + 1)k\lambda)$, the number
of Binary nodes on P must be at least

$$\frac{n^{\varepsilon/2}\,(1 - 4/(\ln(n^{\varepsilon}) - 2\ln(2)))}{2(\ln(n) + 1)k\lambda} - n^{\varepsilon/4} > \frac{n^{\varepsilon/4}\,(1 - 4/\ln(n^{\varepsilon}))}{(\ln(n) + 1)k\lambda}$$

for sufficiently large n, and the theorem holds.

Now suppose that $Pp > n^{\varepsilon/4}$. We know that no Orphaned subtree contains a bignode.
Each Orphaned subtree cannot contain more than $n^{\varepsilon/2}$ distinct clauses since otherwise
we could trace a path through the subtree, as in Theorem 4, and get to a bignode. Then
$Pp \le n^{\varepsilon}$ (an upper bound on the product of the pathlength of P and the maximum
number of distinct clauses below each Orphan of P). Let I(P) be the set of distinct
clauses labeling leaves of Orphaned subtrees. Let Bp be the set of edges on P which
connect a Binary node to its child on P. Each literal in I(P) corresponds to a distinct
edge label in Orphaned subtrees, edges connecting Unary nodes to their children, and
edges in Bp. Let hp denote the number of distinct edge labels which are associated
only with edges in Bp and are due to leaves of all Orphaned subtrees. Recall that
T(H) is a refutation tree so Pp distinct clauses labeling leaves of T(H) generate kPp
distinct edge labels in T(H). Define Lp = kPp − hp. That is, Lp is the number of
distinct edge labels which are associated with edges connecting Unary nodes to their
children and edges within Orphaned subtrees and are due to leaves of all Orphaned
subtrees. Let Vp denote the set of variables which are associated with Unary nodes
and nodes in Orphaned subtrees. See Figure 2 for an example showing sets mentioned
above. Figure 2 also illustrates sets mentioned below.

In this paragraph we show that Icomm(I(P)) ≥ Lp − |Vp|. Let VBDp denote
the set of all variables that appear at least two times in I(P) but are not in Vp.
These variables are associated only with Binary nodes. Let LUSp denote the set
of edge labels which are in edges that connect Unary nodes to their children and are
associated with variables that appear exactly once in I(P). Since there is one edge
label in LUSp for every variable that is both associated with a Unary node and in
I(P) exactly once, |Vp| + |VBDp| − |LUSp| is the number of variables in I(P) which
are in at least two clauses of I(P). Let LOUDp denote the set of edge labels which
are in Orphaned subtrees along P or are associated with edges that connect Unary
nodes with their children on P and are associated with variables that appear two or
more times in I(P). Let LBDp denote the set of edge labels not in LOUDp which
label edges in Bp and are associated with variables that appear two or more times in
I(P). Note that |LBDp| ≥ |VBDp| since there is at least one edge label associated
with an edge incident on a variable in VBDp which is not in LOUDp. Recall that
Icomm(I(P)) is the total number of literals in I(P) that are associated with variables
that appear two or more times in I(P) minus the number of such variables. Then,

Icomm(I(P)) = |LOUDp| + |LBDp| − |Vp| − |VBDp| + |LUSp|

            ≥ |LOUDp| − |Vp| + |LUSp|.

But |LOUDp| + |LUSp| = Lp since LOUDp ∩ LUSp = ∅. It follows that Icomm(I(P)) ≥
Lp − |Vp|.

Therefore, hp ≥ kPp − Icomm(I(P)) − |Vp|.

In this paragraph we show that |Vp| ≤ Pp. Create a forest T' from T(H) by removing
all nodes and edges from T(H) except the Unaries, edges connecting Unaries to their
children, and Orphaned subtrees along P. Construct a tree T'' from T' by appending
each Unary to the free edge of another Unary so that all Unaries are in the same
order as they were on P. Retain all edge label and variable associations that existed
originally. The number of variables associated with nodes in T'' is, by definition,
|Vp| (exactly the Unary nodes and Orphaned subtrees remain). Perform a depth first
search on T'' and mark leaves that contain labels distinct from all other previously
marked leaves. Eliminate all nodes that are not ancestors of marked nodes, edges on
paths to unmarked leaves and the unmarked leaves themselves. The result is a tree
containing a number of binary and unary nodes (lower case binary and unary nodes
are ordinary binary and unary nodes). Call the eliminated edges that were connected
to unaries "missing" (so there is one missing edge for each unary in T''). Except for
one Unary, Unaries on P can appear either as binaries in T'' or as unaries in T'' with
a clause in I(P) labeling the missing edge. The exception is the deepest Unary in P.
This Unary is the special unary node in T'' and possibly has no clause in I(P) labeling
its missing edge. The number of leaves remaining in T'' is Pp, each one representing
a distinct clause. Except for the special unary, a variable v associated with a unary
node in T'' must be the same as the variable associated with a previously visited node
in T''. This is because some clause must label the edge missing from the unary and
the edge is missing because the clause has already been visited; but, since v is in the
clause, a node associated with v must have been visited. Consequently, except for
the variable associated with the special unary, every variable in T'' is associated with
some binary in T''. The number of binary nodes in T'' is Pp − 1. Hence the number
of variables in T'' (that is, |Vp|) is at most Pp (after adding 1 for the special unary).

Thus,  hp  >  kPp  —  Icomm(I(P))  —  Pp.  We  next  apply  our  familiar  bound  on 
Icomm(I(P)). 

In what follows we make statements which are true for all H applied to almost all
instances of k-SAT generated according to M(n,r,k) if $\lambda(n) = o(n^{\omega(n)})$ and
$\lambda > \ln(2)/(-\ln(1 - 2^{-k}))$ for all n > 0; these conditions are omitted for brevity. Since
$|I(P)| = Pp \le n^{\varepsilon}$ we can obtain from Theorem 1 that Icomm(I(P)) ≤ Pp(1 +
1/ln(Pp)). Therefore, since k ≥ 3, hp ≥ (k − 1)Pp − Pp(1 + 1/ln(Pp)) ≥ Pp(1 −
1/ln(Pp)). Since $Pp > n^{\varepsilon/4}$ we have that $hp \ge n^{\varepsilon/4}(1 - 4/\ln(n^{\varepsilon}))$. The labels counted
by hp must be spread over edges in Bp. From Theorem 3, no edge in Bp can receive
more than (ln(n) + 1)kλ labels. Hence the number of edges in Bp must be at least
$n^{\varepsilon/4}(1 - 4/\ln(n^{\varepsilon}))/((\ln(n) + 1)k\lambda)$. Since there is one edge in Bp for every Binary node
on P, the number of Binary nodes must be at least $n^{\varepsilon/4}(1 - 4/\ln(n^{\varepsilon}))/((\ln(n) + 1)k\lambda)$
in this case. This completes the theorem.

Theorem  6: 

Let T(H) be a search tree generated by SRB for any H function. If on every path
from the root of T(H) to a bottomnode there are at least s nodes whose left child and
right child are ancestors of bottomnodes then T(H) must have at least $2^{s}$ nodes.

Proof:

Compress T(H) by eliminating all nodes from T(H) which are not bottomnodes or
do not have two children which are ancestors of bottomnodes. The result is a binary
tree of depth at least s. The number of nodes in such a tree is at least $2^{s}$.
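The counting argument can be exercised on toy trees with a short Python sketch. The conventions are assumptions made for illustration: a node is a tuple of its children, bottomnodes are the nodes at a fixed depth d, and a child that is itself a bottomnode counts as an ancestor of a bottomnode.

```python
def has_bottom(node, d):
    """True if some node at relative depth d exists at or below node."""
    return d == 0 or any(has_bottom(child, d - 1) for child in node)

def count_bottom(node, d):
    """Number of bottomnodes (nodes at relative depth d) below node."""
    return 1 if d == 0 else sum(count_bottom(child, d - 1) for child in node)

def min_binary(node, d):
    """Minimum, over root-to-bottomnode paths, of nodes whose two children both lead to bottomnodes."""
    if d == 0:
        return 0
    live = [child for child in node if has_bottom(child, d - 1)]
    if not live:
        return None  # no bottomnode below this node
    here = 1 if len(live) == 2 else 0
    return here + min(min_binary(child, d - 1) for child in live)

def full(d):
    """Complete binary tree of depth d."""
    return () if d == 0 else (full(d - 1), full(d - 1))

lopsided = ((), full(2))  # the left child of the root is a dead end
for tree in (full(3), lopsided):
    s = min_binary(tree, 3)
    print(s, count_bottom(tree, 3), count_bottom(tree, 3) >= 2 ** s)
```

In both examples the number of bottomnodes alone is already at least $2^{s}$, which is the inequality the compression argument delivers.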

The  following  two  theorems  state  the  main  results. 

Theorem  7: 

The following statement holds with probability tending to 1. For all H functions, SRB
requires $\Omega(2^{n^{\alpha}})$ time, α > 0, under M(n,r,k) for any fixed $\lambda > \ln(2)/(-\ln(1 - 2^{-k}))$.

Proof:

Follows from Theorem 5, Theorem 6 where s in Theorem 6 is

$$n^{\varepsilon/4}\,(1 - 4/\ln(n^{\varepsilon}))/((\ln(n) + 1)k\lambda)$$

and $\varepsilon = 1/(\ln(\lambda e^{2} k^{k+2}) + 3)$ and the fact that each node of T(H) must be visited and
requires at least one unit of time.
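To get a feel for the constants, the exponent s can be evaluated numerically. This is an illustrative sketch with hypothetical sample values (k = 3, λ = 5.2); it takes ln(n) as its argument so that very large n can be explored without overflow.

```python
import math

def epsilon(k, lam):
    """epsilon = 1 / (ln(lambda * e^2 * k^(k+2)) + 3)."""
    return 1.0 / (math.log(lam * math.e ** 2 * k ** (k + 2)) + 3)

def theorem7_exponent(ln_n, k, lam):
    """s = n^(eps/4) * (1 - 4/ln(n^eps)) / ((ln(n) + 1) * k * lambda), given ln(n)."""
    eps = epsilon(k, lam)
    return (math.exp(eps * ln_n / 4) * (1 - 4 / (eps * ln_n))
            / ((ln_n + 1) * k * lam))

print(epsilon(3, 5.2))                   # roughly 0.082
print(theorem7_exponent(40, 3, 5.2))     # negative: the bound is vacuous at this size
print(theorem7_exponent(1e3, 3, 5.2))    # positive once ln(n^eps) exceeds 4
```

The output suggests that the bound is asymptotic in nature: for these sample values it only becomes positive when ln(n) is above roughly 49.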

Theorem  8: 

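The following statement holds with probability tending to 1. For all H functions,
SRB requires superpolynomial time under M(n,r,k) for all functions λ(n) satisfying
$\lambda(n) = o(n^{1/\ln\ln(n)})$, $\lim_{n\to\infty} \lambda(n) = \infty$.

Proof:

As in Theorem 7, the number of nodes in T(H) is at least

$$2^{\left(\frac{1 - 4/\ln(n^{\varepsilon})}{(1 + \ln(n))k\lambda}\right) n^{\varepsilon/4}}$$

$$= 2^{\left(\frac{1 - 4/\ln(n^{\varepsilon})}{k\ln(n)\lambda + \Theta(k\lambda)}\right) n^{\varepsilon/4}}$$

$$= 2^{\left(\frac{1 - \Theta(\ln(\lambda))/\ln(n)}{k\ln(n)\lambda}\right) n^{(4\ln(\lambda))^{-1}(1 - \Theta((\ln(\lambda))^{-1}))}}$$

But $\Theta(\ln(\lambda))/\ln(n) = o(1/\ln\ln(n))$ so the last term is

$$2^{\Theta((\ln(n)\lambda)^{-1})\, n^{\Theta((\ln(\lambda))^{-1})}}$$

But $\lambda(n) = o(n^{1/\ln\ln(n)})$ so the last term is

$$2^{\Theta((\ln(n)\lambda)^{-1})\, n^{1/o(\ln(n)/\ln\ln(n))}}$$

which grows too fast to be polynomial.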

According to Theorem 8 superpolynomial time is achieved for a rather large range
of relationships of n to r. For example, we get superpolynomial time, almost always,
if $n/r(n) = (\ln(n))^{\beta}$ for any positive constant β. We have shown superpolynomial time, almost
always, for n/r(n) almost as high as $n^{1/\ln\ln(n)}$ which is very nearly n to a constant power.
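That a polylogarithmic ratio falls within the range of Theorem 8 can be checked by comparing logarithms: $\ln((\ln n)^{\beta}) = \beta \ln\ln n$ while $\ln(n^{1/\ln\ln n}) = \ln n/\ln\ln n$. The sketch below (the function name and the sample value β = 2 are illustrative assumptions) evaluates the difference as ln(n) grows.

```python
import math

def log_ratio(ln_n, beta):
    """ln of (ln n)^beta / n^(1/lnln n), expressed as a function of ln(n)."""
    return beta * math.log(ln_n) - ln_n / math.log(ln_n)

# ln(n) = 10^2, 10^3, ..., 10^6 with beta = 2.
values = [log_ratio(10.0 ** e, 2.0) for e in range(2, 7)]
print(values)
```

The values are negative and keep decreasing, consistent with $(\ln n)^{\beta} = o(n^{1/\ln\ln n})$.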


3.  Davis-Putnam  Procedure 

As stated in the introduction, the Davis-Putnam Procedure contains three principal
components: decomposing I into two subinstances I1 and I2, the unit-clause rule and the
pure-literal rule. It is the pure-literal rule that appears to be preventing the analysis
from carrying over. However, we can show that for the Davis-Putnam Procedure with any
heuristic function there is a Search Rearrangement Backtracking algorithm which expands
fewer nodes of the search tree when inputs are unsatisfiable. This means that our result
holds also for the Davis-Putnam Procedure.

To  see  this,  first  develop  a  search  tree  for  a  given  Davis-Putnam  Procedure  and  input 
I  which  is  patterned  after  the  search  tree  developed  for  SRB  (that  is,  nodes  are  associated 
with  variables,  leaves  are  labeled  with  clause  names,  edges  are  associated  with  leaf  labels, 
etc.).  Observe  any  node  x  in  the  tree  which  corresponds  to  a  point  in  the  application  of 
the  Davis-Putnam  Procedure  where  the  pure-literal  rule  is  used  for  the  last  time  before 
backing  up.  That  node  has  only  one  child  which  we  denote  by  y.  Create  another  child  z 
of  x  and  subtree  under  z  which  is  exactly  the  same  as  the  subtree  under  y.  Because  the 
pure-literal  rule  is  applied  at  x,  the  edge  (x,z)  has  no  label  associated  with  it.  Therefore, 
we  can  replace  the  subtree  rooted  at  x  with  the  subtree  rooted  at  z  and  still  have  a  search 
tree  corresponding  to  a  verification  of  unsatisfiability;  the  difference  is  that  there  is  one 
less  application  of  the  pure-literal  rule  and  the  tree  has  fewer  nodes.  Continuing  this 
until  all  pure-literal  applications  are  removed  results  in  a  search  tree  corresponding  to 
a  Search  Rearrangement  Backtracking  algorithm  applied  to  I.  This  tree  is  smaller  than 
the  one  corresponding  to  the  application  of  the  Davis-Putnam  Procedure  on  I  and  has  the 
property  that  there  is  at  least  one  clause  label  associated  with  every  edge.  Therefore,  the 
results  of  the  last  section  apply  to  any  form  of  Davis-Putnam  Procedure. 
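For readers who want to experiment, the three components named above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure analyzed in the paper: clauses are lists of nonzero integers (a negative number is a negated variable), the splitting heuristic simply takes the first literal of the first clause, and `use_pure` toggles the pure-literal rule. All names are hypothetical.

```python
from itertools import product

def simplify(clauses, lit):
    """Set lit true: drop satisfied clauses and delete the complementary literal."""
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]

def pure_literals(clauses):
    """Literals whose complement occurs nowhere in the formula."""
    lits = {l for c in clauses for l in c}
    return {l for l in lits if -l not in lits}

def dp(clauses, use_pure=True):
    """Return (satisfiable, number of search-tree nodes visited)."""
    if not clauses:
        return True, 1
    if any(len(c) == 0 for c in clauses):
        return False, 1
    unit = next((c[0] for c in clauses if len(c) == 1), None)
    if unit is not None:                      # unit-clause rule
        sat, nodes = dp(simplify(clauses, unit), use_pure)
        return sat, nodes + 1
    if use_pure:
        pure = pure_literals(clauses)
        if pure:                              # pure-literal rule
            sat, nodes = dp(simplify(clauses, next(iter(pure))), use_pure)
            return sat, nodes + 1
    lit = clauses[0][0]                       # split into two subinstances
    sat1, n1 = dp(simplify(clauses, lit), use_pure)
    if sat1:
        return True, n1 + 1
    sat2, n2 = dp(simplify(clauses, -lit), use_pure)
    return sat2, n1 + n2 + 1

# All eight sign patterns over three variables: an unsatisfiable 3-SAT instance.
unsat = [[s1 * 1, s2 * 2, s3 * 3] for s1, s2, s3 in product((1, -1), repeat=3)]
print(dp(unsat, use_pure=True)[0], dp(unsat, use_pure=False)[0])
```

Running it with `use_pure=False` corresponds to plain backtracking with the unit-clause rule, which is the kind of search the tree surgery described above reduces to.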


4.  Conclusions 

We have presented an analysis which shows that any form of Search Rearrangement
Backtracking requires $\Omega(2^{n^{\alpha}})$ time, α > 0, for almost all instances of k-SAT generated according
to M(n,r,k) if n/r = λ where λ is fixed and is such that almost all instances are
unsatisfiable. We have also shown superpolynomial running time for almost all instances of k-SAT
if $\lambda(n) = o(n^{1/\ln\ln(n)})$ and $\lim_{n\to\infty} \lambda(n) = \infty$. The proof of this is interesting because it
is based on a structural property of instances of k-SAT. We have shown that these results
also apply to any form of the Davis-Putnam Procedure.
Lemma:

$\left\{{n \atop n-x}\right\} \le n^{2x}/x!$ where {} denotes Stirling numbers of the second kind.

Proof:

By induction on n.

Basis: $\left\{{1 \atop 1}\right\} = 1 = 1^{0}/0!$.

Induction Step: From the definition of Stirling numbers of the second kind

$$\left\{{n \atop n-x}\right\} = (n-x)\left\{{n-1 \atop n-x}\right\} + \left\{{n-1 \atop n-x-1}\right\}$$

$$\le (n-x)\,\frac{(n-1)^{2(x-1)}}{(x-1)!} + \frac{(n-1)^{2x}}{x!} \qquad \text{by hypothesis}$$
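Factoring $n^{2x}/x!$ out of the right-hand side leaves

$$\frac{n^{2x}}{x!}\left[\left(1 - \frac{1}{n}\right)^{2x}\left(1 + \frac{x(n-x)}{(n-1)^{2}}\right)\right].$$

The lemma holds if the term within brackets is less than or equal to 1. We show this
as follows. Rewrite $(1 - 1/n)^{2x}$ as $e^{2x\ln(1-1/n)}$. Notice that, for x ≥ 1,

$$1 + \frac{x(n-x)}{(n-1)^{2}} \le 1 + \frac{x}{n-1} \le \left(1 + \frac{1}{n-1}\right)^{x} = e^{-x\ln(1-1/n)}.$$

Therefore, the term in brackets is less than $e^{2x\ln(1-1/n)} \cdot e^{-x\ln(1-1/n)} = e^{x\ln(1-1/n)} \le 1$.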
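The lemma can also be checked exhaustively for small n using the recurrence for Stirling numbers of the second kind. The sketch below is illustrative (the function names and the range n ≤ 15 are arbitrary choices); it uses exact integer arithmetic, comparing $\left\{{n \atop n-x}\right\} x!$ with $n^{2x}$.

```python
from math import factorial

def stirling2_table(n_max):
    """S[n][m] = number of partitions of an n-set into m non-empty blocks."""
    S = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    S[0][0] = 1
    for n in range(1, n_max + 1):
        for m in range(1, n + 1):
            S[n][m] = m * S[n - 1][m] + S[n - 1][m - 1]
    return S

def lemma_holds(n_max):
    """Check S(n, n-x) * x! <= n^(2x) for 1 <= n <= n_max and 0 <= x < n."""
    S = stirling2_table(n_max)
    return all(S[n][n - x] * factorial(x) <= n ** (2 * x)
               for n in range(1, n_max + 1) for x in range(0, n))

print(lemma_holds(15))
```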
Figure 1: A subtree of T(H) rooted at x showing leaf labels beneath the leaves and clause
labels associated with edges (literals are implied). Variables associated with nodes are
shown inside the nodes. For this subtree I(x) consists of five 3-literal clauses,
common(x) = 7 and h(x) = 3*5 − 11 = 4.


bottomnode 


Binaries  :  These  nodes  are  marked  B 
Unaries  :  These  nodes  are  marked  U 
Orphans  :  These  nodes  are  marked  O 

Vp  :  These  variables  are  associated  with  unmarked  nodes  or  nodes  marked  U,  O  or  I 

VBDp  :  These  variables  are  associated  only  with  nodes  marked  B 

I(P)  :  These  distinct  clauses  label  nodes  marked  I 
Bp  :  These  edges  are  marked  e\ 

LOUDp  :  These  edge  labels  associate  with  ej  edges,  e3  edges  and  some  ea  -  e4  pairs 

LUSp  :  These  edge  labels  associate  with  es  edges  not  in  LOUDp 

LBDp  :  These  edge  labels  associate  only  with  et  edges 


Figure  2:  Illustration  of  terms  used  in  Theorem  5 


